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An  algorithm  is  presented  for  the  rapid  evaluation  of  expressions  of  the  form  Qfc *  e‘UkX,  at  the 

points  which  are  arbitrarily  distributed  in  the  interval  [— 7r, x].  The  algorithm  is  based 

on  trigonometric  interpolation  techniques  and  in  most  cases  of  practical  interest,  evaluation  of  the 
above  sum  at  m  points  requires  0(N  •  log  N  +  m)  operations.  This  problem  can  be  viewed  as  a 
generalization  of  the  Discrete  Fourier  Transform  and  the  scheme  of  the  paper  is  widely  applicable 
to  many  problems  encountered  in  mathematics,  science  and  engineering. 
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1  Introduction 


/ 


'  Fourier  techniques  have  been  a  popular  analytical  tool  in  the  study  of  physics  and  engineering 
for  more  than  two  centuries.  A  reason  for  the  usefulness  of  such  techniques  is  that  -tie  trigono¬ 
metric  functions^aSv**  are  eigenfunctions  of  the  differentiation  operator  and  can  be  effectively 
used  to  model  solutions  of  differential  equations  which  arise  in  the  fields  mentioned  above. 

With  the  arrival  of  digital  computers,  it  became  theoretically  possible  to  calculate  the 
Fourier  series  and  Fourier  transform  of  a  function  numerically.  This  was  unrealistic  in  practice 
however  owing  to  the  prohibitive  complexity  of  even  modestly  sized  problems.  A  major  break¬ 
through  in  overcoming  this  difficulty  was  the  development  of  the  Fast  Fourier  Transform  (FFT) 
algorithm  in  the  1960s  which  established  Fourier  analysis  as  a  useful  and  practical  numerical 
tool. /n^\  A 

^Lin  this  paperwe  present$an  algorithm  for  the  rapid  evaluation  of  expressions  of  rips  form  4 

^  N 

k=  1 


piw*r 


(1) 


where  uj6R  and  a*  €  C  for  k  =  1, . . .,  N,  and  x  €  [— it,  7r] . 

The  performance  of  the  scheme  is  dependent  on  the  distribution  of  frequencies,  and  we  let 
n  denote  the  smallest  power  of  2  such  that  Uk  6  [— ?,  § j  f°r  a-U  Our  algorithm  then  requires 
a  number  of  arithmetic  operations  proportional  to 

n  •  log2  n  +  (N  +  m)  •  Ioge  Q J  (2) 

to  evaluate  the  sum  at  m  arbitrary  points  in  [-ir,  tt]  where  £  is  our  required  accuracy  for  the 
results.  In  many  cases  of  practical  interest  it  turns  out  that  n  ~  N  and  the  operation  count 
reduces  to 

0{N  -log,  N  +  (N  +  m)-loge(j))  (3) 

giving  us  a  significant  improvement  over  the  N  -m  operations  required  for  the  direct  evaluation. 

The  problem  as  described  above  can  be  viewed  as  a  generalization  of  the  discrete  Fourier 
transform,  which  is  defined  by  the  equations 


Fi  =  Jr  E  A  •  e-w,  j  =  0, . . . ,  N  -  1  (4) 

1  k=0 

for  a  given  sequence  of  N  numbers  /*  G  C.  In  this  special  case  of  (1),  the  frequencies  u>k  are 
integers  and  the  points  Xj  are  equally  spaced  in  [0,27r],  Under  these  conditions,  the  matrix 
representation  of  the  Fourier  kernel  has  a  simple  structure  which  can  be  exploited,  and  the 
well  known  and  highly  efficient  Classical  FFT  algorithm  provides  a  way  of  evaluating  the  N 
sums  (4)  in  0(N •  log  N)  arithmetic  operations  with  a  very  small  constant  as  opposed  to  0(N2) 
operations  for  the  direct  evaluation.  There  are  in  fact  a  variety  of  such  schemes,  but  all  are 
purely  algebraic  in  nature. 


In  the  more  general  case,  however,  the  underlying  structure  of  the  matrix  is  not  so  easily 
exploitable.  The  algorithm  of  this  paper  makes  use  of  some  results  from  approximation  theory 
coupled  with  existing  FFT  techniques  to  give  a  very  versatile  solution  for  the  generalized 
problem. 

The  plan  of  the  paper  is  as  follows.  We  start  in  section  2  with  some  results  from  analysis  and 
approximation  theory  which  are  used  in  the  design  of  the  algorithm.  An  exact  statement  of  the 
problem  in  section  3  is  then  followed  by  an  informal  description  of  the  algorithm  in  section  4. 
In  section  5  we  introduce  some  notation  which  is  used  in  a  more  detailed  description  of  the 
algorithm  in  section  6.  In  section  7  we  describe  some  modifications  to  the  scheme  to  improve 
its  performance  for  certain  frequency  distributions.  The  results  of  our  numerical  experiments 
are  presented  in  section  8  to  illustrate  the  behavior  and  performance  of  the  algorithm.  Finally, 
section  9  lists  some  generalizations  of  the  method  and  some  conclusions. 

2  Mathematical  and  Numerical  Preliminaries 

2.1  Analytical  Tools 

The  following  well-known  lemma  gives  us  the  convergence  rate  of  Fourier  series  and  can  be 
found  in  slightly  different  forms  in,  for  example,  [2,  4j. 

Lemma  2.1  Let  f  :&•-*  C  be  a  2ir -periodic  function  with  a  Fourier  series  expansion 

m=  £  cke'kd  (5) 

OO 

If  f  has  a  piecewise  continuous  p-th  derivative  then 

«  =  (6) 
Another  standard  result  is  the  translation-invariance  of  a  Gaussian  integral  (see  e.g.[lj). 
Lemma  2.2  Let  b,  u,  v  G  R.  Then 

2.2  Relevant  facts  from  Approximation  Theory 

The  principal  numerical  tool  of  this  paper  is  based  on  one  simple  observation  to  which  this 
section  is  devoted. 

Consider  the  2:r-periodic  function  given  by  e'“  on  the  interval  [-jt.tt]  with  non-integer 
frequency,  c.  This  is  discontinuous  at  odd  integer  multiples  of  *  and  from  lemma  2.1  its 
spectrum  decays  like  1/fc.  The  Fourier  series  of  such  a  function  thus  has  poor  convergence 
properties.  A  standard  technique  for  substantially  improving  the  convergence  rate  is  to  multiply 
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the  function  by  a  Gaussian  bell  e~bx2  where  b  is  chosen  such  that  e~b*2  =  e  for  a  specified  error 
tolerance  e  >  0,  giving  us 

*«**g^.  (8) 

The  effect  of  this  is  to  smoothly  force  the  discontinuity  to  be  small,  thereby  increasing  the 
number  of  continuous  derivatives  at  the  ends  of  the  interval  to  a  precision  s  while  preserving 
the  continuity  of  all  derivatives  away  from  these  ends. 

We  consider  now  the  properties  of  the  new  function, 

G(x;b,c)  =  e-bx2  -eicx.  (9) 

G  still  has  a  discontinuity  of  size  ~  e  at  x  =  ±ir.  Let  us  define  a  slightly  perturbed  function 
by 

G(x)  =  (e~bs2  -  e)  ■  eicx  (10) 

=  G(*) -«•«**.  (11) 

Then  G(x)  has  the  following  properties: 

•  |G(x)  —  G(x)|  =  c  for  x  6  [— ff,ir] 

•  G  is  continuous  and  G(-r)  =  G(x)  =  0 

•  G  has  a  piecewise  continuous  first  derivative  with  a  single  discontinuity  of  size  ~  s  at  ±x. 


By  lemma  2.1,  we  can  write 


=  £  Ske' 


where  the  decay  like  0(l/k2). 

This  suggests  that  if  we  keep  only  those  coefficients  which  are  bigger  than  s  the  truncation 
error  incurred  will  be  0(e).  As  G  is  smooth  except  for  a  small  discontinuity  in  its  derivative, 
we  expect  too  that  the  number  of  such  coefficients  is  small. 

A  precise  statement  of  this  fact  along  with  more  detailed  error  analysis  are  contained  in  the 
following  theorems. 

Theorem  2.3  Let  G(x)  be  defined  by  (10)  for  a  given  tolerance  e  >  0.  Then, 


where 


G(x)=  £  hkeikx  +  0(e) 

k=- oo 


hk  =  *  e-(c-*)s/46. 

ly/hm 


3 


Proof.  The  Fourier  coefficients  for  G  are  given  analytically  by  the  formula 
2*gk  =  J  G{x)e~'kxdx 

=  r  G(x)e~iksdx  -  r*  G(x)e~ikxdx  -  f°°  G{x)e~ikxdx 

7-00  J -OO  J  7T 


-ifcrdx 


The  first  integral  can  be  obtained  explicitly  by  completing  the  square  and  using  Lemma  2.2 
as  follows 


r  G{x)e~ikxdx  =  r  e 

7—00  7 — oo 

-  f « 


— 6r2+«ci-ifcr  . 


e-Mx-.(c-fc)/2)J-(c-fc)2/46rfl 

OO 

l.e-ic-k)2/ Ab 

b 


We  also  have 


f  *  G(x)e-ikxdx  +  f°°G(x)e-ikxdx 

7 — oo  7ir 

=  f  *  e~bx2  e'(e~k)xdx  4-  [°°  e~bx3  e«c~Vxdx 
7-oo  7ir 

=  2  f  e~bxi  cos ((c  —  k)x)dx 

J-K 

=  [e_AxJ  sin((c  -  fc)x)]~  +  J  xe~bx2  s\n((c  -  k)x)dx 

=  ~irhsi,l((-c  ~  *),r) + J ' 0  (ic^w)  ■ 


Finally, 


£  •  f  e,cxe~ikxdx  =  -^-rsin ((c  -  k)ir) 
J -it  c  —  k 


Substituting  for  these  integrals  in  the  formula  (16),  we  find  that  the  1  /(c  —  k)  terms  cancel, 


and  that 

/  |  \ 

(17) 

where  h*  are  defined  by 

A*  =  i  j^G(z)e-^dx 

(IS) 

—  .  1  ..  e-(c-fe)J/4fc 

“  2v^ 

(19) 

4 


Substituting  now  for  gk  in  (12), 


G(z)=  £  hke'kx  +  R(x)  (20) 

ks~  oo 

where 

(21) 

for  some  set  of  coefficients  rk. 

The  size  of  this  error  can  be  bounded  as  follows 

\R(x)\<er  £  (~Ij2<£r'  (22) 

where  r,  r'  are  real  constants. 

VVe  can  thus  rewrite  (20)  as 

G(x)=  £]  hfceifc*  +  0(e).  (23) 


As  a  consequence  of  the  fact  that  \G  —  Gj  =  e  and  the  result  of  the  previous  theorem,  we 
have 

Corollary  2.4  Let  G{x)  =  e~bx*  e,cx  and  hk  be  defined  by  (14)  for  a  given  tolerance  e  >  0. 
Then 

G(x)=  £]  hke'kx  +  0(e)  (24) 

JfSS-OO 

for  x  e  [-JT,  t). 

The  coefficients  hk  themselves  look  like  a  Gaussian  with  a  peak  at  the  nearest  integer  to  c 
which  we  denote  by  kc.  They  decay  superalgebraically,  i.e.  faster  than  any  finite  power  of  1  jk. 
Suppose  we  now  keep  only  the  q  +  1  largest  terms  in  the  series,  where 

e-((<?+i)/2)a/46  <  £  (25) 

Rearranging  and  substituting  for  b  from  (8),  we  get 

?>2ff-1og,(I)].  (») 

Note  that  the  bandwidth  q  is  independent  of  the  frequency  c. 

The  next  theorem  gives  an  estimate  for  the  truncation  error  of  the  series  under  these 
conditions. 


5 


(27) 


Theorem  2.5  Let  G(x)  —  e~bx*  e'01 .  Then,  in  the  notation  of  this  section, 

kc+q/2 

G(x)=  £  hkeik*+0(e) 

k—kc-q/2 

for  x  e  [-ir,ir]. 

Proof.  The  truncation  error  for  the  series  is  given  by 


oo  fcc-<?/2-l 

E  +  E 


kskc+q/2+l 


*=  —  OO 


<  — L_  V'  ^e-(c-fc)J/4i  +  e-(c+fe)J/4l>) 

2\Zbir  .  ,  , 


fe=ke+?/2+x 
00 


<  4=  E 


*=(7/2+1 
oo 


_  _L_  V  g"(fc+?/2 +D*/46 

_L,,-(*/2+i)a/4*  jr  p-k*/4b 


k—Q 


7b ;(' 


£  /  \/467r  \ 

1  +  — J 


Ioge(l/£) 


+  1 


In  summary,  any  function  of  the  form  e-l>z  •  e,CI  can  be  accurately  represented  using  a 
small  number  of  terms  of  the  form  e,kx. 

Multiplying  both  sides  of  the  formula  (27)  by  e4*3,  we  obtain  an  approximation  to  the 
original  function  on  the  subinterval  [-§,  §]: 

kc+f/2 

e,cx  =  ebx2  ■  ^2  hkeikx  +  0{k  •  e)  (28) 

k=kc~q/2 


where 


(29) 


is  the  maximum  size  of  e4*3  on  the  subinterval. 

The  worst-case  estimates  obtained  above  provide  us  with  rather  pessimistic  upper  bounds 
for  the  truncation  errors. 


We  estimated  the  actual  error  in  approximating  eicx  using  terms  of  the  form  ebx2e'kx  for 
different  choices  of  tolerance  e  and  number  of  terms  q.  The  results  are  presented  in  Table  1. 
We  only  tested  the  approximation  for  the  constant  function  f(x)  =  1  corresponding  to  the 
frequency  c  =  0  as  the  spectra  for  all  other  frequencies  are  translates  of  the  spectrum  for  this 
one.  The  error  estimates  are  thus  valid  for  all  real  frequencies  c.  The  approximation 

q/2 

H(x)  =  ebx2-  £  hkeikx  (30) 

*=-7/2 

to  1  with  hk  defined  as  in  equation  (14)  was  computed  at  n  =  1000  equally  spaced  nodes  xk  in 
[— x/2,rr/2]  and  the  entries  in  the  table  are  defined  as  follows: 


•  The  numbers  of  terms,  q,  we  used  are  given  by 


q  =  2 


2  . 

-  •  loge 

X 


G)1 


+  4  •  t 


for  i  =  0, 1,2. 

•  E°°  is  the  maximum  absolute  error  defined  by 


E°°  =  max  |1  -  ff(xfc)| 

!<*<n 


•  Ex  is  the  mean  absolute  error  defined  by 


n*=i 

Table  1: 


(31) 


(32) 


(33) 


e 

q 

E°° 

E1 

1  E-4 

12 

0.671  E-05 

0.202  E-05 

16 

0.100  E-07 

0.581  E-09 

20 

0.100  E-07 

0.542  E-09 

1  E-6 

18 

0.264  E-06 

0.311  E-07 

22 

0.101  E-09 

0.118  E-10 

26 

0.100  E-ll 

0.367  E-13 

1  E-8 

24 

0.332  E-08 

0.478  E-09 

28 

0.197  E-ll 

0.266  E-12 

32 

0.434  E-14 

0.815  E-15 

1  E-10 

30 

0.142  E-09 

0.924  E-ll 

34 

0.983  E-13 

0.660  E-14 

38 

0.140  E-13 

0.152  E-14 
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We  observe  from  the  above  results  that  in  all  our  tests, 

7/2 

|  eicx-ebx2-  £  hkeikx\<2s  (34) 

*=-7/2 

and  for  larger  q  the  truncation  errors  are  several  orders  of  magnitude  less  than  e.  The  error 
estimate  0(k  •  e)  in  (28)  can  thus  be  replaced  by  O(s)  for  most  practical  purposes. 

Finally,  by  means  of  a  simple  linear  transformation,  the  formula  (28)  generalizes  from  [-x,  x] 
to  any  interval  [a  —  d,  a  +  d]  to  give  us  the  result  upon  which  the  algorithm  is  based. 

Corollary  2.6  Let  c  be  any  real  frequency,  a,d  be  real  numbers  with  d  >  0  and  i  >  0  be  a 
given  tolerance.  Then,  in  the  notation  of  this  section, 

fee +7/2 

e*“  _  e6((r-aW<i)3  .  £  hkelk^-a)irld  +  0(e)  (35) 

k=kc-q/2 


for  x  6  [a  —  d/2,  a  +  d/2]. 

3  Exact  Statement  of  the  Problem 

In  the  following  sections,  we  will  assume  that: 

1.  {wx, . .  .,w/v}  and  {xi,...,xm}  are  finite  sequences  of  real  numbers 

2.  n  is  an  integer  such  that  uk  €  (— j,  f]  for  A:  =  1, . . . ,  N 

3.  -x  <  zi  <  •••  <  xm  <  x 

4.  {osj, . .  .,Qyv)  is  a  finite  sequence  of  complex  numbers 

5.  We  wish  to  evaluate  the  sums 

N 

Sa,U*])='£,«k-e'UkX>  (36) 

fc= i 

for  ji  =  with  a  relative  accuracy  s  >  0.  More  precisely,  we  are  looking  for  a  set 

of  numbers  5(xy)  such  that 

|5(x;)  -  S*M{zj)\  <  e  (37) 

n=i  |a*i 


S 


4  Informal  Description  of  the  Algorithm 

The  results  of  section  2  lead  us  to  observe  that  the  linear  transformation  described  by  equation 
(36)  has  an  approximate  sparse  factorization  to  a  relative  precision  e. 

The  factors  correspond  to  the  following  operations  which  are  performed  by  the  algorithm: 

1.  Obtain  approximate  Fourier  coefficients  for  e~bxi Sa,u{x)  on  the  interval  [-2x,2 x]. 

2.  Evaluate  this  Fourier  series  at  equally-spaced  points  in  [ — 2?r,  27r]  using  FFT. 

3.  Retrieve  the  approximate  values  of  Sa<ul(x)  at  equally  spaced  points  in  [— 7r,  7r]  by  multi¬ 
plying  by  eix 2 . 

4.  Split  [— jt,  7c]  into  small  equal  subintervals.  Extend  each  subinterval  to  twice  its  length 
and  multiply  the  approximate  values  of  SaM(x)  on  each  one  by  a  Gaussian  bell. 

5.  Apply  FFT  to  new  set  of  values  on  each  subinterval,  thereby  obtaining  short  interpolat¬ 
ing  trigonometric  polynomials  which  closely  approximate  e~b'^x~Ck'>  Sa<w(x)  on  each  h- th 
subinterval  whose  center  is  c* . 

6.  Evaluate  each  series  directly  at  the  relevant  points. 

7.  Retrieve  approximate  values  of  Sa<u(xj)  on  the  appropriate  fc-th  subinterval  by  multiply¬ 
ing  by 

Remarks. 

•  In  Step  1  a  matrix  P  is  applied  to  the  vector  of  coefficients  Qfc.  Due  to  (27),  this  matrix 
is  given  analytically  and  is  banded  to  a  precision  e  with  bandwidth  q. 

•  Steps  3-7  can  be  combined  and  represented  by  a  block-diagonal  matrix  Q  where  the  k- th 
block  is  the  product  of  the  linear  transformations  on  the  k- th  subinterval. 

•  The  algorithm  is  divided  into  two  parts: 

—  Initialization,  which  precomputes  and  stores  the  matrix  operators  P  and  Q  in  terms 
of  {wjt}  and  {ij}. 

-  Evaluation,  which  applies  the  sequence  of  linear  transformations  to  the  vector  {a*} 
to  obtain  approximate  values  of  the  series  at  the  specified  points. 

For  many  applications,  the  frequencies  and  points  are  fixed  and  we  wish  to  evaluate 
5a,w(i)  repeatedly  for  different  sets  of  {ait}.  With  the  above  formulation  the  initializa¬ 
tion  only  needs  to  be  performed  once  for  any  number  of  subsequent  evaluations  giving 
considerable  time  savings  in  such  cases. 
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•  It  remains  for  us  to  determine  choices  for  the  number  of  equally  spaced  points  needed  in 
step  2  and  for  the  subinterval  size  in  step  4  such  that  the  operation  count  is  minimized 
under  the  constraint  that  the  interpolation  errors  are  within  our  desired  tolerance. 

The  FFT  requires  at  least  2  points  per  wavelength  to  resolve  a  particular  frequency.  The 
highest  frequency  term  in  the  original  series  has  a  maximum  of  n  wavelengths  on  the 
interval  [— 2jt,2jt].  If  the  subinterval  size  is  such  that  the  highest  frequency  term  has 
W  wavelengths,  the  Fourier  series  for  e~b'^x~Ck'>  SatUI(x)  on  each  subinterval  will  have 
W  +  q/2  as  its  highest  frequency.  We  thus  need  2W  +  q  points  on  each  subinterval,  or 
at  least  2  +  q/W  points  per  wavelength  to  resolve  this  highest  frequency  and  achieve  our 
desired  accuracy.  Our  investigations  showed  that  an  optimal  choice  is  a  subinterval  size 
such  that  W  =  q/2,  thus  making  our  requirements  4  points  per  wavelength,  4n  points  on 
( — 27t, 2tt]  and  2 q  points  on  each  extended  subinterval.  As  there  are  2n  points  in  [-:r,x] 
and  q  points  in  each  unextended  subinterval,  the  number  of  subintervals  is  given  by  2 n/q. 

5  Notation 


In  this  section  we  introduce  some  notation  to  be  used  in  the  detailed  description  of  the  algo¬ 
rithm. 

We  assume  that  the  problem  to  be  solved  is  that  as  described  in  section  3  above.  We  assume 
too  that  e  >  0  is  a  given  real  error  tolerance. 

First  we  use  the  result  2.6  of  section  2.2  with  a  =  2x  to  obtain  an  approximation  to  the 
series  on  [ — tt,  7r|. 

We  define  p*  to  be  the  nearest  integer  to  2uk  for  k  =  1, . . .,  N . 

The  real  number  b  is  defined  such  that  e~b =  e,  giving  us 


L  l°g«(l/0 

‘=  '4*'  ' 


From  equation  (27),  each  term  e~bx 2  ■e'UJkX  can  be  accurately  represented  by  a  short  trigono¬ 
metric  polynomial  whose  dominant  frequency  is  pk ■  We  define  sets  of  numbers  {Pji },...,  {.P^v} 
to  be  the  coefficients  for  each  such  polynomial  according  to  the  expression 

Pjk  =  —L =e-^k~(Pk+})/2)3/4b  (39) 

4vf>7T 


for  A:  =  l,...,jV  and  j  =  -q/2, . .  .,q/2.  Here  we  define  the  bandwidth  q  to  be  the 
power  of  2  such  that 


q  >  2  • 


'2 

7T 


smallest 

(40) 


as  in  (26),  so  that  due  to  corollary  2.6, 


<7/2 

_  g6r*  .  £  pjk  .  e.(p*+j)x/2  +  0(£)  (41.) 

j  =  -,/2 


10 


for  k  =  1  and  x  €  [-t, 7r]. 

We  now  define  a  function  T{x),  polynomial  coefficients  {/?;},  and  n  to  be  a  power  of  2  such 
that 

N  ?/2  n 

T( x)  =  £afc  2  ^-e,(pk+J>/2=  E  ^e,Jr/2-  (42) 

fc=l  J=-?/2  j'=-n 

We  also  define  a  function 

£f(*)  =  e6*2  •  T(x)  (43) 

Observation  5.1  U(x)  can  be  viewed  as  an  approximation  to  50,w(i)  and  furthermore,  due 
to  (41), (4%), (4$)  and  the  triangle  inequality, 

N 

\SaM  ~U(x)\  =  0(e)-  J>;|  (44) 

i=  1 

for  all  x  €  [-ir,  tt). 

Let  {tj}  be  a  set  of  4n  equally  spaced  points  in  [— 2?r,27r]  defined  by 

tj  =  -2x  +  (j  -  l)7r/n.  (45) 

for  j  —  1, . . .  ,4n. 

We  next  consider  the  result  2.6  applied  to  each  of  a  set  of  small  subintervals  of  [-r,  jt],  and 
introduce  a  notation  for  such  subintervals  and  their  associated  expansions. 

Let  M  =  2n/q  denote  the  number  of  subintervals  on  [— jt.tt]. 

We  define  the  set  of  subintervals  {Ak}  for  k  =  1  to  be  an  equal  partitioning  of 

[ — 7r,  7r]  by  the  formula 

Ak  =  [cfc  -  ^,Ck  +  |]  (46) 

where  the  length  of  each  subinterval  is  a  =  2 x/M,  and  the  center  of  each  is  given  by 

Ck  =  -*•  +  (fc  -  ^)ff.  (47) 

The  extended  subintervals  { Bk }  are  then  defined  for  k  =  1, . . .,  M  by 

Bk  =  [cfc  -  a,  Ck  +  cr]  (48) 

so  that  each  Ak  is  of  length  a  and  each  Bk  is  of  length  2a,  both  centered  at  c*. 

The  scaled  Gaussian  bell  e“^(s~c*)2  on  each  Bk  is  characterized  by  a  number  b'  such  that 

L/-2  .  . 

e  =  e,  giving  us 

y  =  l°6.(l/£)  =  4^  ^  (49) 

a2  a2 

The  subintervals  have  been  chosen  such  that  on  each  Bk  the  Fourier  series  approximation 
to  e_6'(E-e<>)2  .  Sa>u(x)  to  a  precision  e  has  2 q  terms. 
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For  k  =  1, . . M  let  the  set  of  points  {^'}  be  the  subset  of  {£,  }  which  lie  in  Bk  defined  by 

(50) 


fk  _  „  ,  3° 
tj  =  Ck  +  — 
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for  j  =  -q,...,q  -  1. 

Let  us  now  define  the  sets  of  numbers  {Vf  }, . . . ,  {V^}  by 

Vf  =  .  £A(tJ) 


(51) 

(52) 


for  j  =  1. 

We  denote  by  {7*}  the  Discrete  Fourier  Transform  of  each  {V;fc}  viewed  as  a  sequence  in  j 
for  each  fixed  k ,  i.e. 

it  =  ~  £  V/.e~M^  (53) 

"  i=— <? 

for  j  =  -9, .  ..,q-  1  and  A:  =  1, . . . ,  M . 

Observation  5.2  For  k  =  1  the  set  of  coefficients  {7*}  defines  a  q-th  order  interpo¬ 

lating  trigonometric  polynomial  which  can  be  viewed  as  an  approximation  to  e~b '^x~Ck^ SaiU/(x) 
on  the  interval  Bk.  Furthermore,  defining  a  function  on  [-tt,x]  by 


(j-i 


S(x)  =  e6'(r  Cfc)2  •  it  '  ell(x~Ck^/a  when  x  6  Afc, 
t=- 7 


we  get 


|S(x)  -  5a>w(*)| 


e;v=1  i«ii 


=  0(e) 


(54) 


(55) 


for  all  x  6  {— Jr,  t]. 


We  partition  the  set  of  points  {xj}  according  to  subinterval,  and  for  each  k  =  1, . . M  we 
define  {i*,. . .,1*^}  to  be  the  subset  consisting  of  all  points  which  lie  in  .4*.  The  integer  n* 
denotes  the  number  of  points  in  the  fc-th  subinterval. 

We  notice  now  that  (52), (53)  and  (54)  constitute  a  sequence  of  linear  transformations  which 
can  be  combined  into  a  single  matrix  on  each  subinterval.  To  this  end  we  define 


<7-1 


g*  _  e»'(r*-c n)3  _  £  e<p(**-c*W*  J_  e-2x<pl/2q  .  ,  ei>(jf)2 

p=-<7  ^ 

for  k  =  1, . . . ,  J/,  j  =  1, . . .,  n*  and  /  =  -<7, . . .  ,<7  -  1,  so  that, 


(56) 


12 


Observation  5.3  In  the  notation  of  this  section, 

s(*i)  =  E  <3*1  •  T(t?)  (57) 

/=-? 

for  k  =  1,...,AT  and.;  = 


A  closer  inspection,  of  (56)  reveals  that  the  sum  over  p  is  in  fact  a  geometrical  series  and 
substituting  for 

j  _  (tf  ~  cfc) 
q  a 

from  (50),  the  sum  can  be  written  as 


.  ,,  ?-i 


-ck)/a)-ipirl/q  _  ^  gip(rJ-tf W<7 

(58) 

P=-<7 

P=-9 

Writing 

=  (Xi  ~  *?)»/*. 

(59) 

the  sum  can  be  further  simplified  to 

7-1  k 

T  e'vys‘  = 

/  «-*»?.  _  \ 

(60) 

v--<t 

\  1  -  e‘yj*  / 

- 

«-'”V2 .  sta(^'>  if  *  o 

sin(y*/2)  * 

(61) 

and 

9-1 

E 

evy>1  =  2g  if  =  0. 

(62) 

p=— ? 


This  latter  equation  corresponds  to  the  case  when  z*  =  for  some y,  A’,/,  i.e.  when  one  of  the 
points  for  evaluation  coincides  with  one  of  the  equally  spaced  points. 

The  definition  (56)  can  thus  be  rewritten  as 


e6'(x*-c*)*  .  JL  .  g— *v*«/2 


sin(gy^) 


e-b'(t!-ck)\eb(t*)> 


if  y*,  *  0 
if  yj)  =  0 


(63) 
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6  Detailed  Description  of  the  Algorithm 

6.1  Numerical  Procedure 

This  subsection  contains  a  step  by  step  breakdown  of  the  details  of  the  algorithm. 
Initialization  Phase 

Comment  [Input  to  this  phase  are  the  vectors  {u/j, . .  .,un}  and  {xi,...,xm}  and  an  error 
tolerance  £.] 

Step  1. 

Comment  [Choose  parameters.  Compute  nearest  integer  frequencies  {pk}-  Evaluate  Fourier 
coefficients  {Pjk}  for  each  term  e~bx2 e'WkX\ 

Determine  n 

Compute  b  and  q  in  terms  of  the  tolerance  £ 
do  fc  =  1,  iV 
pk  =  int(2uk ) 
do  j  =  ~q/2,q/2 

Compute  Pjk  according  to  (39) 
end  do 
end  do 

Step  2. 

Comment  [Geometrical  preprocessing.] 

Set  number  of  subintervals,  M  =  2 n/q. 

Compute  subinterval  length,  a  =  2 t/M,  and  b'  =  b-  4tt2 /a2. 
do  k  =  1,  M 

Construct  subintervals  .4*  and  Bk  and  their  center  Cfc. 

Construct  subset  (ij, . .  .,z^} 
end  do 

Comment  [Compute  elements  Q of  the  nkX2q  matrix  corresponding  to  each  &-th  subinterval.] 

do  k  ~  1,  M 
do  j  =  l,njt 

do  l  =  -q,  q  -  1 

Compute  according  to  (63) 

end  do 
end  do 
end  do 

End  of  Initialization  Phase 


Evaluation  Phase 

Comment  [Input  to  this  phase  is  the  vector  {ax, . . . ,  cw}.] 

Step  1. 

Comment  [Compute  Fourier  coefficients  (3j  for  e-6lS5aiW(x).] 

do  k  =  1,  jV 

do  j  =  -q/2,q/2 

fipk+j  =  Ppk+j  +  '  ak 

end  do 
end  do 


Step  2. 

Comment  [Evaluate  this  Fourier  Series  at  equispaced  points  on  [-2jr,27r]  using  inverse  FFT. 
Determine  subsets  of  the  resulting  values  according  to  subinterval.] 

Compute  T(tj)  =  E*=_n/2^  ■  e'ht>'2. 
do  k  =  1,  M 

Construct  subset  {T(tl9), . . . , )} 
end  do 


Step  3. 

Comment  [On  each  subinterval,  compute  approximations  to  the  series  at  points  for  evaluation 
in  terms  of  the  values  at  equally  spaced  points.] 

do  k  =  1,  M 
do  j  =  l,nfc 

s(^)  =  ZV-,Qk,rT(t!) 

end  do 
end  do 

End  of  Evaluation  Phase 
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6.2  Complexity  Analysis 

In  this  subsection  we  present  the  operation  counts  for  each  stage  of  the  algorithm. 


Step 

Operation 

Explanation 

Number 

Count 

Initialization 

l 

O(N-q) 

Calculation  of  nearest  integer  to  each  of  N  frequencies.  Fourier 
coefficients  Pjk  are  computed.  There  are  q  of  these  for  each  of  the 
N  terms  in  the  original  series. 

2 

0(m  •  q) 

Each  of  m  points  Xj  is  assigned  to  a  single  subinterval  Ajt-  Coef¬ 
ficients  Q yj  are  computed.  There  are  2 q  of  these  for  each  of  the 
m  points. 

Evaluation 

1 

0{N  ■  q) 

Computation  of  the  Fourier  coefficients  Each  of  the  N  coeffi¬ 

cients  at  contributes  to  q  of  these. 

2 

0(4n  log4n) 

Inverse  FFT  to  evaluate  the  approximating  Fourier  series  at  4n 
points. 

3 

0(m  -  q) 

Computation  of  the  approximation  S(xj )  at  each  of  the  m  points. 

Each  of  these  is  given  by  a  linear  combination  of  the  values  at  2 q 
equally  spaced  points. 


Remark.  In  the  description  of  the  algorithm,  n  was  chosen  to  be  a  power  of  2.  This  is  not 
an  essential  requirement,  but  our  reason  for  making  such  a  choice  is  that  step  2  of  our  scheme 
consists  of  an  FFT  of  size  4n,  and  the  FFT  algorithm  is  most  efficient  when  n  is  a  power  of  2. 

Adding  the  operation  counts  for  each  step  of  the  algorithm,  we  obtain  the  estimates 

A  •  N  •  q  +  B  ■  m  ■  q  (64) 

for  the  initialization  time,  and 

C  ■  n  ■  \ogn  +  D  ■  N  •  q  +  E  ■  m  ■  q  (65) 

for  the  evaluation  time  where  the  coefficients  A,  B,C,  D,  E  depend  on  the  computer  system, 
language,  implementation,  etc. 

In  many  cases  of  practical  interest  n  ~  jV  and  using  the  fact  that  q  ~  log(£)  the  CPU  time 
estimate  for  evaluation  reduces  to  the  form 

C  •  N  •  log  N  +  (D  •  N  +  E  •  m)  •  log  ^  .  (66) 
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The  storage  requirements  of  the  algorithm  are  also  an  important  characteristic.  From  the 
above  explanations  for  the  initialization  steps  the  asymptotic  storage  requirements  are  of  the 
form 

A  ■  N  •  q  +  n  •  m  ■  q  (67) 

where  once  again  the  coefficients  \,p,  are  software-  and  hardware- dependent. 

7  Modifications  for  Special  Cases 

In  certain  cases  of  interest,  the  nature  of  the  frequency  distribution  can  be  exploited  to  reduce 
the  operation  count  of  the  algorithm.  Here  we  present  informal  outlines  of  such  modifications 
to  the  scheme  in  two  particular  situations. 

7.1  Evaluation  of  Fourier  Series  at  Arbitrary  Points 

If  the  frequencies  uk  are  all  integers,  the  problem  reduces  to  the  evaluation  of  a  Fourier  series 
on  [— Jr,  jr].  We  can  obtain  the  values  at  equally  spaced  points  on  this  interval  by  just  using 
an  FFT  with  4  points  per  wavelength  and  without  having  to  first  multiply  the  series  by  a 
Gaussian  bell.  This  eliminates  the  first  step  and  roughly  halves  the  required  time  for  the 
second  step.  Although  the  asymptotic  time  complexity  is  unchanged,  such  a  modification 
yields  an  anticipated  speedup  of  a  factor  of  2. 

7.2  Non-Homogeneous  Frequency  Distributions 

The  complexity  of  the  algorithm  is  given  in  terms  not  of  the  number  of  frequencies,  but  in 
terms  of  the  size,  n,  of  the  smallest  interval  containing  all  the  frequencies.  Thus,  for  some 
fixed  number  N  of  frequencies,  the  operation  count  will  be  much  lower  if  the  frequencies  are 
clustered  together  on  an  interval  of  size  smaller  than  N  than  if  they  are  widely  spaced  on  an 
interval  of  size  much  larger  than  N. 

In  some  cases  the  frequencies  may  be  distributed  in  a  highly  non-uniform  manner,  and  can 
be  grouped  into  widely  separated  clusters  on  the  real  line.  We  may  improve  the  algorithm’s 
performance  in  such  cases  by  treating  the  series  as  a  sum  of  separate  sub-series,  applying  the 
method  to  each  and  adding  the  results  at  the  end.  If  a  sub-series  has  2 q  or  fewer  terms,  then 
we  evaluate  it  directly.  We  also  use  the  following  simple  result: 

Observation  7.1  Let  a,d  be  real  numbers  with  d  >  0  and  suppose  uk  €  [a  -  d,  a  +  d]  for 
k  =  1 , . . . ,  N .  Then  we  can  write 

/v  N 

£  ak  ■  elUkX  =  eiax  £  ak  •  ei(w*'a)l.  (68) 

k=l  k=l 

For  the  series  on  the  left  hand  side,  n  ~  |a|  +  d  whereas  for  the  series  on  the  right  hand  side, 
all  the  frequencies  are  in  [—d,  d\,  so  n  ~  d,  giving  us  considerable  time  savings  when  |a|  3>  d. 
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Remark. 

This  algorithm  performs  well  when 

•  the  frequencies  within  a  cluster  are  close  together 

•  there  are  very  few  clusters 
and  not  so  well  if 

•  the  frequencies  are  widely  separated 

•  there  are  many  clusters. 

Most  cases  likely  to  be  encountered  in  practice  fall  in  the  first  category. 


8  Numerical  Results 


A  FORTRAN  implementation  of  the  algorithm  of  this  paper  has  been  written  and  consists 
of  two  main  subroutines,  the  first  implementing  the  initialization  stage,  and  the  second  the 
evaluation  stage. 

Our  numerical  experiments  were  conducted  on  the  Sun  Sparcstation  1  and  in  this  section 
we  present  the  results  from  some  of  our  investigations.  The  3  examples  below  illustrate  the 
performance  of  the  algorithm  when  applied  to  a  variety  of  input  data  for  problems  of  differ¬ 
ent  sizes  and  different  error  tolerances.  All  computations  were  performed  in  double  precision 
arithmetic  and  in  each  case  the  series  was  evaluated  by  the  direct  method  for  error  estimation 
and  timing  comparisons. 

A  description  of  the  entries  in  the  tables  follows,  where  N  is  the  number  of  terms  in  the 
series,  m  is  the  number  of  points  for  evaluation,  Sa>w{ik)  denotes  the  sum  (1)  evaluated  directly 
at  the  point  x*  and  S(xk)  denotes  our  approximation  to  the  sum  (1)  at  the  point  Xk  as  evaluated 
using  the  algorithm  of  this  paper. 


•  E°°  is  the  maximum  relative  error  at  any  point  of  evaluation  defined  by: 

coo  _  |5(xfc)  -  5a,w(xfc)| 

t,  —  max  - tj - 

!<*<m  EjLt  |0t;  | 

•  Ex  is  the  mean  relative  error  defined  by: 

ri  _  1  v  -  W**)l 

*  ~  m  ^  ^‘v 


(69) 


fc=i 


Ef=i  I*, I 


(70) 


•  ty jg  is  the  initialization  time  for  the  algorithm 

•  t™gl  is  the  CPU  time  for  the  subsequent  evaluation  stage  of  the  algorithm 
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•  tar  is  the  CPU  time  for  the  direct  evaluation  of  the  series 

•  tFFT  is  the  CPU  time  for  a  single  FFT  of  size  N. 


Example  1. 

Here  we  evaluate  a  Fourier  series  at  non-equally  spaced  points  by  choosing  frequencies  {wjt} 
and  points  {ij}  as  defined  by  the  formulae: 

uk  =  k  —  \  -  N/2  for  A:  =  1 . N  (71) 

Xj  =  — t  cos  — yjr)  for  j  =  1, . .  .,m  (72) 

and  the  coefficients  {ajt}  generated  randomly  on  the  unit  square  in  the  complex  plane  with 
3?(ait)  €  [0,1],  §(0!*)  e  [0,1].  The  method  for  such  a  problem  as  described  in  section  7  was 
employed  in  this  case.  Results  of  our  experiments  are  presented  in  Tables  2-4. 
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Table  2:  Numerical  Results  for  Example  1  with  s  =  10-2,  q  =  8 


N 

m 

E°° 

El 

HI 

HI 

tdir 

tF  FT 

64 

64 

0.127  E-03 

0.359  E-04 

0.02 

0.01 

0.04 

mm 

128 

128 

0.131  E-03 

0.247  E-04 

0.06 

0.02 

0.17 

n 

256 

256 

0.102  E-03 

0.176  E-04 

0.10 

0.05 

0.65 

0.009 

512 

512 

0.808  E-04 

0.130  E-04 

0.22 

0.10 

2.51 

0.021 

1024 

1024 

0.705  E-04 

0.950  E-05 

0.46 

0.20 

10.01 

0.047 

2048 

2048 

0.556  E-04 

0.659  E-05 

0.92 

0.43 

40.03 

0.102 

4096 

4096 

0.624  E-04 

0.469  E-05 

1.81 

0.97 

160.48 

0.219 

Table  3:  Numerical  Results  for  Example  1  with  e  =  10-5,  q  =  16 


N 

m 

E°° 

El 

HI 

*eva  l 

7 

tdir 

tpFT 

64 

64 

0.294  E-06 

0.523  E-07 

if»yj 

0.04 

0.002 

128 

128 

0.375  E-06 

0.421  E-07 

0.17 

0.004 

256 

256 

0.258  E-06 

0.198  E-07 

0.18 

MSI 

0.65 

0.009 

512 

512 

0.210  E-06 

0.192  E-07 

0.35 

2.51 

0.021 

1024 

1024 

0.144  E-06 

0.127  E-07 

0.74 

10.01 

0.047 

2048 

2048 

0.112  E-06 

0.935  E-08 

1.47 

3 

40.03 

0.102 

4096 

4096 

0.972  E-07 

0.629  E-08 

2.95 

1.41 

160.48 

0.219 

Table  4:  Numerical  Results  for  Example  1  with  e  =  10  10,  q  =  32 


N 

m 

E°° 

El 

HI 

HI 

tdir 

If  FT 

64 

64 

0.207  E-10 

0.190  E-ll 

0.08 

0.03 

0.04 

0.002 

128 

128 

0.318  E-10 

0.298  E-ll 

0.15 

0.05 

0.17 

0.004 

256 

256 

0.161  E-10 

0.112  E-ll 

0.30 

0.11 

0.65 

0.009 

512 

512 

0.162  E-10 

0.808  E-12 

0.60 

0.22 

2.51 

0.021 

1024 

1024 

0.920  E-ll 

0.526  E-12 

1.25 

0.46 

10.01 

0.047 

2048 

2048 

0.124  E-10 

0.379  E-12 

2.63 

0.93 

40.03 

0.102 

4096 

4096 

0.819  E-ll 

0.292  E-12 

5.36 

2.13 

160.48 

0.219 

Remark.  In  this  example,  direct  evaluation  of  the  trigonometric  polynomial  was  performed 
via  Horner’s  scheme  which  can  be  found  in  [3]  for  example. 
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Example  2. 

In  this  example,  the  frequencies  {w*}  were  randomly  distributed  on  the  interval  [-N/2,  N/2\, 
the  points  {ij}  randomly  distributed  on  [ — 7r, tt j  and  the  coefficients  {a*}  generated  randomly 
as  before  on  the  unit  square  in  the  complex  plane  with  &(acfc)  €  [0,1],  $(afe)  6  [0,1].  The 
results  are  presented  in  Tables  5-7. 

Table  5:  Numerical  Results  for  Example  2  with  e  =  10-2,  9  =  8 


Dir  time 


eval 


0.837  E-04 
0.499  E-04 
0.371  E-04 
0.258  E-04 
0.181  E-04 
0.134  E-04 
0.916  E-05 


N 

m 

E°° 

64 

64 

0.280  E-03 

128 

128 

0.295  E-03 

256 

256 

0.247  E-03 

512 

512 

0.127  E-03 

1024 

1024 

0.141  E-03 

2048 

2048 

0.137  E-03 

4096 

4096 

0.734  E-04 

Table  6:  Numerical  Resul 

N 

m 

E°° 

64 

64 

0.790  E-06 

128 

128 

0.745  E-06 

256 

256 

0.517  E-06 

512 

512 

0.349  E-06 

1024 

1024 

0.397  E-06 

2048 

2048 

0.250  E-06 

4096 

4096 

0.156  E-06 

Alg  time 

init 

eval 

0.06 

r 

.02 

0.11 

.05 

0.20 

.09 

0.46 

.19 

0.83 

1 

.37 

1.73 

.84 

3.47 

u 

.71 

r  Example  2  with  e  =  10  5, 


E 


Alg  time 

Dir  time 

init 

eval 

init 

eval 

0.134  E-06 
0.997  E-07 


0.09  0.03 


0.303  E-07  1.37 

0.180  E-07  2.72 


Table  7:  Numerical  Results  for  Example  2  with  s  =  10-10,  q  =  32 


0.436  E-10 
0.397  E-10 
0.388  E-10 
0.266  E-10 
0.264  E-10 
0.185  E-10 
0.153  E-10 


Alg  time 

Dir  time 

init 

eval 

init 

eval 

0.329  E-ll 
0.295  E-ll 
0.127  E-ll 
0.184  E-ll 
0.112  E-ll 
0.714  E-12 
0.488  E-12 
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Remark.  In  order  to  make  a  fair  comparison  of  our  method  with  the  direct  one  in  this 
example,  we  also  split  the  latter  into  an  initialization  stage,  in  which  ail  the  required  complex 
exponentials  are  precomputed  and  stored,  and  an  evaluation  stage  in  which  the  precomputed 
matrix  is  applied  to  the  set  of  coefficients  {a*}.  However,  such  a  formulation  requires  N  ■  m 
storage,  and  for  N,m  >  1024  the  available  memory  on  the  machine  is  insufficient.  For  larger 
problems,  the  direct  scheme  has  to  compute  all  exponentials  as  needed  and  timings  are  pre¬ 
sented  for  such  an  implementation  in  these  cases. 


Example  3. 

In  this  example,  the  frequencies  {wfc}  were  randomly  distributed  on  the  interval  [— N,- N/2] 
for  k  =  1,  JV/2  and  on  [N/2,N]  for  k  =  N/2  4-  1  ,N.  As  in  the  previous  example,  the  points 
{xy}  were  randomly  distributed  on  [ — 7r,  zr]  and  the  coefficients  {a*}  generated  randomly  on 
the  unit  square  3?(a*)  £  [0, 1],  §(afc)  £  [0, 1]. 

The  sums  were  calculated  in  two  ways:  first  using  the  algorithm  directly,  and  then  using 
the  method  of  section  7,  i.e.  viewing  Sa,w  as  a  sum  of  two  separate  sub-series,  applying  the 
algorithm  to  each  and  finally  adding  the  results.  Results  of  these  tests  are  presented  in  Table 
8. 


Table  8:  Numerical  Results  for  Example  3  with  e  =  10  2,  q  =  8 


N 

m 

1  cluster 

2  clusters 

tdir 

KH 

m 

^9 

mu 

128 

256 

Kjfil 

ESI 

0.30 

m 

1.64 

256 

512 

ESI 

EH 

0.48 

Era 

6.69 

512 

1024 

tss 

ESI 

1.02 

Era 

26.29 

1024 

2048 

1.51 

m 

2.16 

m 

101.08 

Remark.  The  timings  for  the  I  cluster  and  2  cluster  case  are  similar  in  this  particular 
example. 


Example  4. 

In  this  example,  the  frequencies  {u/jt}  were  randomly  distributed  on  the  interval  [-3iV/2,-rV] 
for  k  =  1,  N/2  and' on  [N,  3N/2]  for  k  =  N/2  -i-  l,jV.  Once  again  the  points  {x:}  were 
randomly  distributed  on  [— zr,  zr]  and  the  coefficients  {a*}  generated  randomly  on  the  unit 
square  &(afc)  €  [0,  lj,  3(ajfe)  €  [0,  Ij. 

The  sums  were  calculated  in  two  ways  as  in  the  previous  example.  Results  of  these  tests 
are  presented  in  Table  9. 
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Table  9:  Numerical  Results  for  Example  4  with  e  =  10~2,  q  =  8 


N 

m 

1  cluster 

2  clusters 

tdir 

4tntt 

^alq 

+cvaL 
• atg 

Unit 

^alq 

4cvai 

''alg 

00 

256 

0.25 

0.15 

0.30 

0.06 

1.64 

256 

512 

0.50 

0.32 

0.48 

0.18 

6.69 

512 

1024 

1.00 

0.71 

1.02 

0.36 

26.29 

1024 

2048 

2.00 

1.41 

2.16 

0.78 

101.08 

Remark.  In  this  example  the  clusters  of  frequencies  are  more  widely  separated  than  in 
the  previous  example,  and  it  is  advantageous  to  consider  the  two  sub-series  separately  in  this 
case. 


The  results  as  presented  in  the  Tables  2-9  lead  us  to  make  the  following  observations: 

1.  The  accuracy  of  the  results  is  well  within  the  specified  tolerance  level,  t.  For  larger  N,  m 
the  method  tends  to  be  slightly  more  accurate. 

2.  The  break  even  point  of  our  algorithm  with  the  direct  calculation  depends  on  the  toler¬ 
ance,  but  is  at  roughly  N,  m  =  64  if  the  initialization  time  is  ignored.  If  the  initialization 
time  is  included,  the  extrapolated  break  even  point  is  at  N,m  «  32.  For  N,  m  =  4096 
our  scheme  is  about  400  times  faster  than  the  direct  method. 

3.  As  expected,  the  computation  time  grows  slightly  more  slowly  than  linearly  with  the 
numbers  of  terms  and  points. 

4.  Evaluation  of  a  Fourier  series  at  an  arbitrary  set  of  points  is  roughly  4  times  as  expensive 
as  an  FFT  of  the  same  size  for  4-5  digits  of  accuracy. 

5.  In  cases  when  the  frequencies  can  be  separated  into  widely  separated  clusters,  it  is  often 
cheaper  computationally  to  consider  the  sub-series  corresponding  to  each  cluster  sepa¬ 
rately. 
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9  Generalizations  and  Conclusions 

The  results  of  this  paper  can  be  generalized  in  the  following  ways: 

1.  One  of  the  more  far-reaching  extensions  of  the  method  is  a  version  of  the  algorithm  in 
higher  dimensions.  Investigations  into  this  are  currently  in  progress. 

2.  The  Helmholtz  equation  in  2  dimensions  is  given  by 

V2d>  +  k24>  -  0 

and  has  particular  solutions  of  the  form 

4>{x,y)  =  •  eil/y 

where  yp  +■  iP  —  «2.  Solutions  of  this  equation  consist  of  linear  combinations  of  such 
functions  subject  to  some  boundary  conditions,  and  the  results  of  this  paper  admit  a 
generalization  which  constitutes  a  fast  Helmholtz  solver. 

3.  Some  series  of  special  functions  reduce  to  the  form  (1)  and  can  thus  be  rapidly  evaluated 
using  the  algorithm  of  this  paper.  In  addition,  a  set  of  interpolation  techniques  similar 
to  those  used  in  our  scheme  may  be  applied  to  other  orthonormal  bases  of  functions  in 
place  of  trigonometric  polynomials  to  give  fast  algorithms  for  other  integral  transforms. 

In  conclusion,  an  algorithm  has  been  presented  for  the  rapid  evaluation  of  series  of  the  form 
( 1).  The  scheme  is  based  on  piecewise  trigonometric  interpolation  of  such  series,  and  subsequent 
direct  evaluation  of  the  resulting  trigonometric  polynomials.  The  problem  can  be  viewed  as  a 
generalization  of  the  Discrete  Fourier  Transform,  and  the  algorithm,  while  making  use  of  some 
simple  results  from  analysis,  is  very  versatile,  and  has  wide-ranging  potential  applications  in 
many  branches  of  mathematics,  science  and  engineering. 
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